Publishing Data Online

Lizzie Scholtus

Institut für Ur- und Frühgeschichte

Why publish data?

Why publish data online?

  • To share data
    • Open Science
    • FAIR Principles

What is Open Science?

  • Free access to:
    • Scientific publications
    • Research data and metadata
    • Hypotheses, methodologies, protocols, code, formats and results
  • Participatory science

Pillars of Open Science
according to UNESCO's 2021 Recommendation on Open Science.
Source: Wikipedia

French national plan for Open Science
Source: French Ministry of Higher Education and Research

What are the FAIR principles?

  • Findable
  • Accessible
  • Interoperable
  • Reusable

Why publish data online?

  • To share data
    • Open Science
    • FAIR Principles
  • To save data
    • Digital data can die
      • Accident
      • Software evolution
    • Version control

Where to publish data?

  • Institutional
    • EU, country, university
  • Non-specialist
  • Private
    • Free or paid
  • Field-specific
  • To create your own database (Heurist)

Three platforms, three uses

Git Platforms

GitHub

  • Oldest (2008)
  • More users
  • Bought by Microsoft
  • Less ready to use
    • Users have to integrate elements from third-party applications themselves, sometimes at a cost

GitLab

  • Open source
  • Completely free at the beginning (no longer the case)
  • Continuous integration and DevOps workflows
  • Some features require a paid account

GitLab

Description

  • Dedicated to versioning
  • Collaborative work
  • Code development

Drawbacks

  • Works from the command line
  • Only a repository, not a real publication

Advantages

  • Branch system
  • Continuous integration
  • Repository can be private or public
  • Licence attribution
  • Institutional instances exist

Zenodo

Description

  • Dedicated to depositing datasets
  • Code can also be deposited
  • Created by CERN

Drawbacks

  • Files cannot be modified once deposited
  • A deposit is fixed, like a publication

Advantages

  • DOI (Digital Object Identifier)
  • Enables easy citation
  • Various metadata
  • Version control
  • Deposit can be private or public
  • Licence attribution
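
As an illustration of how a DOI and deposit metadata make citation easy, here is a minimal sketch in Python. It does not call Zenodo or any other service. The `Deposit` class, its field names, the citation layout and the DOI value are all assumptions made up for this example; only the `https://doi.org/` resolver prefix is standard.

```python
from dataclasses import dataclass


@dataclass
class Deposit:
    """Hypothetical, minimal description of a deposited dataset."""
    title: str
    creators: list[str]
    year: int
    version: str
    doi: str
    licence: str

    def doi_url(self) -> str:
        # A DOI can be resolved through the doi.org proxy.
        return f"https://doi.org/{self.doi}"

    def citation(self) -> str:
        # One possible citation layout; adapt it to the style you need.
        authors = "; ".join(self.creators)
        return (
            f"{authors} ({self.year}). {self.title} "
            f"(Version {self.version}) [Data set]. {self.doi_url()}"
        )


example = Deposit(
    title="Example excavation dataset",
    creators=["Doe, Jane"],
    year=2024,
    version="1.0",
    doi="10.1234/example.5678",  # made-up DOI, for illustration only
    licence="CC-BY-4.0",
)

print(example.citation())
```

Because each version of a deposit is fixed, a citation built this way keeps pointing at exactly the data that was used.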

ArkeoGIS

Description

  • Online geographic information system (GIS)
  • Specific to data about the past
  • Created by the University of Strasbourg

Drawbacks

  • Supports only spatial data
  • Only datasets
  • Data must be aligned with the ArkeoGIS structure

Advantages

  • Data can be linked with other datasets
  • Automatic language alignment
  • Automatic chronological alignment
  • LOD (Linked Open Data)
  • DOI
  • Data can be password-protected or completely open

What is LOD?

What is Semantic Web?

  • Structures and links information online
  • Uses ontologies to build conceptual data models
  • Enables data browsing

Conceptual model based on CIDOC CRM ontology
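
To make the idea of linked data more concrete, the following Python sketch represents a few statements as subject, predicate, object triples and follows the links from one resource. It uses plain tuples rather than a real RDF library. The `ex:` resources are invented, and the `crm:` class and property names reflect my reading of CIDOC CRM (E22, E53, P53); check them against the CIDOC CRM version you actually use.

```python
# Each statement is a (subject, predicate, object) triple.
# "ex:" resources are hypothetical; "crm:" names are assumed from CIDOC CRM.
TRIPLES = [
    ("ex:fibula_001", "rdf:type", "crm:E22_Human-Made_Object"),
    ("ex:fibula_001", "crm:P53_has_former_or_current_location", "ex:site_A"),
    ("ex:site_A", "rdf:type", "crm:E53_Place"),
    ("ex:site_A", "rdfs:label", "Site A"),
]


def objects(subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def linked_resources(subject):
    """Return a predicate -> object mapping for one resource (data browsing)."""
    return {p: o for s, p, o in TRIPLES if s == subject}


# Follow the link from the object to the place where it was found,
# then read what is known about that place.
place = objects("ex:fibula_001", "crm:P53_has_former_or_current_location")[0]
print(place, linked_resources(place))
```

The ontology supplies the shared vocabulary (the `crm:` terms), which is what lets datasets from different projects be browsed together.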

What does this mean for you?

  • The first step is to structure your data
  • Always describe your data with metadata
  • If possible, use open formats
  • Place your data in permanent backup structures (institutional or not) → DMP (Data Management Plan)
  • Specify licences and, where possible, use the most open ones
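
The first three points can be sketched in a few lines of Python: structured rows are written to CSV (an open format) and described by a small JSON metadata file stored next to the data. The file names, column names, metadata keys and example values below are assumptions chosen for illustration, not a required schema.

```python
import csv
import json
import tempfile
from pathlib import Path


def save_dataset(rows, metadata, folder):
    """Write rows to CSV and their description to a JSON sidecar file."""
    folder = Path(folder)
    data_path = folder / "finds.csv"
    meta_path = folder / "finds.metadata.json"
    with data_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    meta_path.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return data_path, meta_path


# Hypothetical structured data: one row per find, one column per attribute.
rows = [
    {"id": "1", "site": "Site A", "type": "fibula", "period": "La Tène"},
    {"id": "2", "site": "Site B", "type": "coin", "period": "Roman"},
]

# Hypothetical metadata: who, what, which licence, and what each column means.
metadata = {
    "title": "Example finds list",
    "creator": "Doe, Jane",
    "licence": "CC-BY-4.0",
    "columns": {
        "id": "Find identifier",
        "site": "Site name",
        "type": "Object type",
        "period": "Chronological period",
    },
}

with tempfile.TemporaryDirectory() as tmp:
    data_path, meta_path = save_dataset(rows, metadata, tmp)
    # Read both files back to check that nothing was lost.
    with data_path.open(newline="", encoding="utf-8") as handle:
        reloaded_rows = list(csv.DictReader(handle))
    reloaded_metadata = json.loads(meta_path.read_text(encoding="utf-8"))

print(reloaded_rows[0], reloaded_metadata["licence"])
```

Both files are plain text, so they stay readable without any particular software.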

The French Data Management Environment

A government strategy

  • National plan for Open Science
  • Law for a Digital Republic
  • Aim:
    • Research Data Management on a national level
    • Dissemination and reproducibility